Phonetic Distance Measures for Speech Recognition Vocabulary and Grammar Optimization
Abstract
This paper reports on the correlation between word confusion matrices from word error rate (WER) experiments and different phonetic distance measures. The investigated measures are based on minimum edit distances between phonetic transcriptions and on distances between hidden Markov models (HMMs). We show that phonetic distance measures are correlated with word confusion. This correlation between the word confusion of a speech recognizer and phonetic distance is useful to speech recognition grammar developers and spoken dialog system designers in developing efficient grammars and dialogs. Furthermore, the measures can be used to evaluate the quality of grammars in terms of the phonetic confusability of words, utterances, or interpretations. An extension of these measures to grammar optimization is discussed.
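As a rough illustration of the first family of measures, the minimum edit distance between two phonetic transcriptions can be sketched as a Levenshtein distance over phoneme symbols. The sketch below assumes unit costs for insertion, deletion, and substitution, which is not necessarily the weighting used in the paper, and the ARPAbet-style transcriptions in `LEXICON` are hypothetical examples rather than data from the experiments.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def phonetic_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences.

    Assumption: every insertion, deletion, and substitution costs 1.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if pa == pb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def closest_pairs(lexicon: Dict[str, List[str]]) -> List[Tuple[int, str, str]]:
    """Return all word pairs sorted by distance, most confusable first."""
    pairs = [
        (phonetic_edit_distance(lexicon[w1], lexicon[w2]), w1, w2)
        for w1, w2 in combinations(sorted(lexicon), 2)
    ]
    return sorted(pairs)


# Hypothetical ARPAbet-style transcriptions, for illustration only.
LEXICON = {
    "nine": ["N", "AY", "N"],
    "five": ["F", "AY", "V"],
    "line": ["L", "AY", "N"],
}

if __name__ == "__main__":
    for distance, w1, w2 in closest_pairs(LEXICON):
        print(f"{w1} / {w2}: {distance}")
```

Word pairs with a small distance are candidates for high confusability in a grammar. A phoneme-dependent substitution cost could replace the unit cost if some phoneme pairs should count as closer than others.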
Similar resources
The SpeeD Grammar-based ASR System for the Romanian Language
This paper describes the grammar-based automatic speech recognition system for the Romanian language developed by the Speech and Dialogue Research Group. The paper links to previous work for the issues related to large vocabulary speech recognition and focuses on the specific optimization work done for several closed-vocabulary, grammar-based speech recognition tasks. Among the specific problem...
Effect of foreign accent on speech recognition in the NATO n-4 corpus
We present results from a series of 151 speech recognition experiments based on the N4 corpus of accented English speech, using a small vocabulary recognition system. These experiments looked at the impact of foreign accent on speech recognition, both within non-native accented English and across different accents, with particular interest in using context free grammar technology to improve cal...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Speech recognition for east Slavic languages: the case of Russian
In this paper, we present a survey of state-of-the-art systems for automatic processing of recognition of under-resourced languages of Eastern Europe, in particular East Slavic languages (Ukrainian, Belarusian and Russian), which share some prominent common features, including the Cyrillic alphabet, phonetic classes, the morphological structure of word forms and relatively free grammar. A large voca...
A New Phonetic Model for Continuous Speech Recognition Systems
The main goal of this work is to describe a new model for a large vocabulary continuous speech recognition system using a phonetic-phonological approach. This work proposes a statistical phonetic structure, applied at the phonetic-phonological level, to improve speech recognition performance in systems with phonetic-phonological modeling. It is shown that the general likelihood scores are i...